Overview

Dataset statistics

Number of variables: 10
Number of observations: 1292
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 101.1 KiB
Average record size in memory: 80.1 B

Variable types

Numeric: 8
Categorical: 2

Warnings

Price is highly correlated with Age0804 and 2 other fields [High correlation]
Age0804 is highly correlated with Price and 1 other field [High correlation]
KM is highly correlated with Price and 1 other field [High correlation]
QuarterlyTax is highly correlated with Weight [High correlation]
Weight is highly correlated with Price and 1 other field [High correlation]
Price is highly correlated with Age0804 and 1 other field [High correlation]
Age0804 is highly correlated with Price and 1 other field [High correlation]
KM is highly correlated with Price and 1 other field [High correlation]
cc is highly correlated with QuarterlyTax and 1 other field [High correlation]
QuarterlyTax is highly correlated with cc and 1 other field [High correlation]
Weight is highly correlated with cc and 1 other field [High correlation]
Price is highly correlated with Age0804 [High correlation]
Age0804 is highly correlated with Price [High correlation]
cc is highly correlated with Weight [High correlation]
QuarterlyTax is highly correlated with Weight [High correlation]
Weight is highly correlated with cc and 1 other field [High correlation]
KM is highly correlated with HP and 2 other fields [High correlation]
HP is highly correlated with KM and 4 other fields [High correlation]
Doors is highly correlated with Weight and 1 other field [High correlation]
Weight is highly correlated with HP and 4 other fields [High correlation]
Age0804 is highly correlated with KM and 4 other fields [High correlation]
QuarterlyTax is highly correlated with HP and 4 other fields [High correlation]
Price is highly correlated with KM and 4 other fields [High correlation]
cc is highly skewed (γ1 = 26.69622404) [Skewed]
Unnamed: 0 is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]

Reproduction

Analysis started: 2021-06-30 19:02:17.395470
Analysis finished: 2021-06-30 19:02:32.258308
Duration: 14.86 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1292
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 645.5
Minimum: 0
Maximum: 1291
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 64.55
Q1: 322.75
median: 645.5
Q3: 968.25
95-th percentile: 1226.45
Maximum: 1291
Range: 1291
Interquartile range (IQR): 645.5
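
The quantile figures above are consistent with linear interpolation between neighbouring order statistics, although the report does not state which method it uses. The following is a minimal sketch under that assumption; `quantile` is a hypothetical helper written for illustration, not part of pandas-profiling.

```python
def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("quantile() requires at least one value")
    pos = q * (len(data) - 1)               # fractional position in the sorted data
    lower = int(pos)                        # order statistic at or below pos
    upper = min(lower + 1, len(data) - 1)   # next order statistic (clamped)
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


# Assumption: the "Unnamed: 0" column holds the integers 0..1291, as the
# minimum, maximum, distinct count and monotonicity in this report suggest.
index_column = list(range(1292))

p05 = quantile(index_column, 0.05)
q1 = quantile(index_column, 0.25)
q3 = quantile(index_column, 0.75)
iqr = q3 - q1
print(p05, q1, q3, iqr)
```

Up to floating-point rounding, this reproduces the 5-th percentile, Q1, Q3 and IQR listed for this variable.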

Descriptive statistics

Standard deviation: 373.1125835
Coefficient of variation (CV): 0.5780210434
Kurtosis: -1.2
Mean: 645.5
Median Absolute Deviation (MAD): 323
Skewness: 0
Sum: 833986
Variance: 139213
Monotonicity: Strictly increasing
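
As a cross-check, several of these descriptive statistics can be recomputed with Python's standard `statistics` module. This is a sketch that assumes the column holds the integers 0..1291 and that the report uses the sample (n - 1) variance; the variable names are illustrative.

```python
import statistics

# Assumption: "Unnamed: 0" is the row counter 0, 1, ..., 1291.
values = list(range(1292))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)          # square root of the sample variance
cv = std_dev / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, std_dev, cv, mad)
```

With the n - 1 denominator the variance matches the reported 139213; a population (n) denominator would give a different figure, which suggests the report uses the sample definition.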
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1291                 1      0.1%
403                  1      0.1%
425                  1      0.1%
426                  1      0.1%
427                  1      0.1%
428                  1      0.1%
429                  1      0.1%
430                  1      0.1%
431                  1      0.1%
432                  1      0.1%
Other values (1282)  1282   99.2%

Value                Count  Frequency (%)
0                    1      0.1%
1                    1      0.1%
2                    1      0.1%
3                    1      0.1%
4                    1      0.1%
5                    1      0.1%
6                    1      0.1%
7                    1      0.1%
8                    1      0.1%
9                    1      0.1%

Value                Count  Frequency (%)
1291                 1      0.1%
1290                 1      0.1%
1289                 1      0.1%
1288                 1      0.1%
1287                 1      0.1%
1286                 1      0.1%
1285                 1      0.1%
1284                 1      0.1%
1283                 1      0.1%
1282                 1      0.1%

Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 223
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10732.46904
Minimum: 4400
Maximum: 31275
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 4400
5-th percentile: 6900
Q1: 8450
median: 9900
Q3: 11926.25
95-th percentile: 18950
Maximum: 31275
Range: 26875
Interquartile range (IQR): 3476.25

Descriptive statistics

Standard deviation: 3622.465521
Coefficient of variation (CV): 0.3375239665
Kurtosis: 3.240482811
Mean: 10732.46904
Median Absolute Deviation (MAD): 1650
Skewness: 1.649458653
Sum: 13866350
Variance: 13122256.45
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
8950                 101    7.8%
9950                 74     5.7%
10950                58     4.5%
7950                 54     4.2%
11950                41     3.2%
8250                 36     2.8%
8750                 35     2.7%
10500                32     2.5%
7750                 32     2.5%
12950                28     2.2%
Other values (213)   801    62.0%

Value                Count  Frequency (%)
4400                 1      0.1%
4450                 1      0.1%
4750                 1      0.1%
5150                 1      0.1%
5250                 2      0.2%
5600                 1      0.1%
5740                 1      0.1%
5750                 2      0.2%
5751                 1      0.1%
5800                 1      0.1%

Value                Count  Frequency (%)
31275                1      0.1%
31000                1      0.1%
24990                1      0.1%
24950                2      0.2%
24500                1      0.1%
23950                1      0.1%
23750                1      0.1%
23000                1      0.1%
22950                1      0.1%
22750                1      0.1%

Age0804
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 77
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.97368421
Minimum: 1
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 19
Q1: 44
median: 61
Q3: 70
95-th percentile: 79
Maximum: 80
Range: 79
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 18.54930636
Coefficient of variation (CV): 0.3313933434
Kurtosis: -0.06457888977
Mean: 55.97368421
Median Absolute Deviation (MAD): 12
Skewness: -0.8310493009
Sum: 72318
Variance: 344.0767663
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
68                   65     5.0%
65                   58     4.5%
80                   52     4.0%
78                   40     3.1%
62                   37     2.9%
67                   36     2.8%
77                   34     2.6%
54                   34     2.6%
75                   33     2.6%
61                   32     2.5%
Other values (67)    871    67.4%

Value                Count  Frequency (%)
1                    2      0.2%
2                    2      0.2%
4                    2      0.2%
6                    1      0.1%
7                    4      0.3%
8                    9      0.7%
9                    3      0.2%
10                   1      0.1%
11                   6      0.5%
12                   2      0.2%

Value                Count  Frequency (%)
80                   52     4.0%
79                   27     2.1%
78                   40     3.1%
77                   34     2.6%
76                   26     2.0%
75                   33     2.6%
74                   28     2.2%
73                   30     2.3%
72                   20     1.5%
71                   21     1.6%

KM
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1156
Distinct (%): 89.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68627.46827
Minimum: 1
Maximum: 243000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 17045.15
Q1: 42499.25
median: 63831
Q3: 87676.25
95-th percentile: 138790.25
Maximum: 243000
Range: 242999
Interquartile range (IQR): 45177

Descriptive statistics

Standard deviation: 37714.61526
Coefficient of variation (CV): 0.5495556839
Kurtosis: 1.619734528
Mean: 68627.46827
Median Absolute Deviation (MAD): 22525
Skewness: 1.006002007
Sum: 88666689
Variance: 1422392204
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
43000                7      0.5%
1                    7      0.5%
36000                7      0.5%
59000                6      0.5%
75000                6      0.5%
61000                5      0.4%
60000                5      0.4%
68000                4      0.3%
37000                4      0.3%
52000                4      0.3%
Other values (1146)  1237   95.7%

Value                Count  Frequency (%)
1                    7      0.5%
15                   1      0.1%
225                  1      0.1%
450                  1      0.1%
1500                 1      0.1%
3000                 1      0.1%
4000                 1      0.1%
5000                 2      0.2%
5278                 1      0.1%
5309                 1      0.1%

Value                Count  Frequency (%)
243000               1      0.1%
232940               1      0.1%
218118               1      0.1%
216000               1      0.1%
207114               1      0.1%
205000               1      0.1%
204250               1      0.1%
203254               1      0.1%
200732               1      0.1%
198167               1      0.1%

HP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.4287926
Minimum: 69
Maximum: 192
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 69
5-th percentile: 72
Q1: 86
median: 110
Q3: 110
95-th percentile: 110
Maximum: 192
Range: 123
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 15.25747138
Coefficient of variation (CV): 0.1504254462
Kurtosis: 9.106434146
Mean: 101.4287926
Median Absolute Deviation (MAD): 0
Skewness: 1.086279095
Sum: 131046
Variance: 232.7904329
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value                Count  Frequency (%)
110                  745    57.7%
86                   230    17.8%
97                   149    11.5%
72                   66     5.1%
90                   32     2.5%
69                   31     2.4%
107                  16     1.2%
192                  11     0.9%
116                  8      0.6%
98                   2      0.2%
Other values (2)     2      0.2%

Value                Count  Frequency (%)
69                   31     2.4%
71                   1      0.1%
72                   66     5.1%
73                   1      0.1%
86                   230    17.8%
90                   32     2.5%
97                   149    11.5%
98                   2      0.2%
107                  16     1.2%
110                  745    57.7%

Value                Count  Frequency (%)
192                  11     0.9%
116                  8      0.6%
110                  745    57.7%
107                  16     1.2%
98                   2      0.2%
97                   149    11.5%
90                   32     2.5%
86                   230    17.8%
73                   1      0.1%
72                   66     5.1%

cc
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 13
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1576.677245
Minimum: 1300
Maximum: 16000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 1300
5-th percentile: 1300
Q1: 1400
median: 1600
Q3: 1600
95-th percentile: 2000
Maximum: 16000
Range: 14700
Interquartile range (IQR): 200

Descriptive statistics

Standard deviation: 443.6188712
Coefficient of variation (CV): 0.2813631469
Kurtosis: 866.6884018
Mean: 1576.677245
Median Absolute Deviation (MAD): 0
Skewness: 26.69622404
Sum: 2037067
Variance: 196797.7029
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value                Count  Frequency (%)
1600                 751    58.1%
1300                 229    17.7%
1400                 149    11.5%
2000                 107    8.3%
1900                 28     2.2%
1800                 13     1.0%
1587                 4      0.3%
1598                 3      0.2%
1995                 2      0.2%
1398                 2      0.2%
Other values (3)     4      0.3%

Value                Count  Frequency (%)
1300                 229    17.7%
1332                 2      0.2%
1398                 2      0.2%
1400                 149    11.5%
1587                 4      0.3%
1598                 3      0.2%
1600                 751    58.1%
1800                 13     1.0%
1900                 28     2.2%
1975                 1      0.1%

Value                Count  Frequency (%)
16000                1      0.1%
2000                 107    8.3%
1995                 2      0.2%
1975                 1      0.1%
1900                 28     2.2%
1800                 13     1.0%
1600                 751    58.1%
1598                 3      0.2%
1587                 4      0.3%
1400                 149    11.5%

Doors
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Value           Count
5               608
3               560
4               123
2               1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1292
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 3
5th row: 3

Common Values

Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Length

Histogram of lengths of the category

Pie chart

Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Most occurring characters

Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1292   100.0%

Most frequent character per category

Decimal Number
Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Most occurring scripts

Value           Count  Frequency (%)
Common          1292   100.0%

Most frequent character per script

Common
Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Most occurring blocks

Value           Count  Frequency (%)
ASCII           1292   100.0%

Most frequent character per block

ASCII
Value           Count  Frequency (%)
5               608    47.1%
3               560    43.3%
4               123    9.5%
2               1      0.1%

Gears
Categorical

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Value           Count
5               1247
6               42
3               2
4               1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1292
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 5
5th row: 5

Common Values

Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

Length

Histogram of lengths of the category

Pie chart

Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

Most occurring characters

Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1292   100.0%

Most frequent character per category

Decimal Number
Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

Most occurring scripts

Value           Count  Frequency (%)
Common          1292   100.0%

Most frequent character per script

Common
Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

Most occurring blocks

Value           Count  Frequency (%)
ASCII           1292   100.0%

Most frequent character per block

ASCII
Value           Count  Frequency (%)
5               1247   96.5%
6               42     3.3%
3               2      0.2%
4               1      0.1%

QuarterlyTax
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 87.25696594
Minimum: 19
Maximum: 283
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 19
5-th percentile: 64
Q1: 69
median: 85
Q3: 85
95-th percentile: 185
Maximum: 283
Range: 264
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 40.97892647
Coefficient of variation (CV): 0.4696350145
Kurtosis: 4.20118678
Mean: 87.25696594
Median Absolute Deviation (MAD): 16
Skewness: 1.985754622
Sum: 112736
Variance: 1679.272415
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value                Count  Frequency (%)
85                   555    43.0%
69                   502    38.9%
185                  88     6.8%
19                   63     4.9%
234                  18     1.4%
100                  18     1.4%
210                  16     1.2%
64                   15     1.2%
197                  12     0.9%
283                  2      0.2%
Other values (2)     3      0.2%

Value                Count  Frequency (%)
19                   63     4.9%
40                   1      0.1%
64                   15     1.2%
69                   502    38.9%
72                   2      0.2%
85                   555    43.0%
100                  18     1.4%
185                  88     6.8%
197                  12     0.9%
210                  16     1.2%

Value                Count  Frequency (%)
283                  2      0.2%
234                  18     1.4%
210                  16     1.2%
197                  12     0.9%
185                  88     6.8%
100                  18     1.4%
85                   555    43.0%
72                   2      0.2%
69                   502    38.9%
64                   15     1.2%

Weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 56
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1072.002322
Minimum: 1000
Maximum: 1480
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.2 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 1015
Q1: 1040
median: 1070
Q3: 1085
95-th percentile: 1140
Maximum: 1480
Range: 480
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 50.29384602
Coefficient of variation (CV): 0.04691579952
Kurtosis: 12.6753301
Mean: 1072.002322
Median Absolute Deviation (MAD): 30
Skewness: 2.511349803
Sum: 1385027
Variance: 2529.470947
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1075                 178    13.8%
1050                 144    11.1%
1015                 109    8.4%
1035                 95     7.4%
1070                 78     6.0%
1025                 63     4.9%
1065                 48     3.7%
1080                 39     3.0%
1060                 38     2.9%
1100                 38     2.9%
Other values (46)    462    35.8%

Value                Count  Frequency (%)
1000                 16     1.2%
1010                 4      0.3%
1015                 109    8.4%
1020                 9      0.7%
1025                 63     4.9%
1030                 21     1.6%
1035                 95     7.4%
1040                 31     2.4%
1045                 27     2.1%
1050                 144    11.1%

Value                Count  Frequency (%)
1480                 3      0.2%
1320                 3      0.2%
1280                 1      0.1%
1275                 2      0.2%
1270                 3      0.2%
1265                 1      0.1%
1260                 4      0.3%
1255                 6      0.5%
1245                 3      0.2%
1205                 4      0.3%

Interactions

[64 interaction plots, one per ordered pair of the 8 numeric variables; images not reproduced]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
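
The calculation described above can be sketched in plain Python. `pearson_r` is a hypothetical helper written for illustration, not the routine the report uses; the example values are the Age0804 and Price entries of the ten "First rows" in the Sample section.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (n - 1 denominators);
    # the denominators cancel, so population versions give the same r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)  # raises ZeroDivisionError for a constant column


# Age0804 and Price of the first ten sample rows of this report.
ages = [20, 75, 59, 65, 55, 24, 59, 76, 74, 68]
prices = [16450, 7950, 10950, 8950, 9950, 16750, 8900, 7450, 7250, 7950]

r_age_price = pearson_r(ages, prices)
print(r_age_price)
```

On these ten rows the coefficient is negative (older cars are cheaper), in line with the Price/Age0804 high-correlation warning. Ten rows are far too few to reproduce the value in the report's heatmap.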

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
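
A minimal sketch of that recipe: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Assigning tied values their average rank is an assumption made here (it is a common convention); `average_ranks` and `spearman_rho` are hypothetical helper names.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# A nonlinear but strictly increasing relationship has a perfect rank correlation.
xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]
rho_cubes = spearman_rho(xs, cubes)
print(rho_cubes)
```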

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
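
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below with a hypothetical helper `kendall_tau_a`. Library implementations commonly apply a tie correction (tau-b), so on data with many tied values, such as HP or cc in this dataset, their results may differ from this sketch.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0: a tie in x or y, counted as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])  # 2 concordant, 1 discordant pair
print(tau_mixed)
```

The loop is quadratic in the number of observations, which is acceptable for a sketch on 1292 rows but not how production libraries compute it.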

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
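
A sketch of a bias-corrected Cramér's V in plain Python, using the correction as it is usually stated for Bergsma (2013). Whether the report's implementation matches this line for line is an assumption, and `cramers_v_corrected` is a hypothetical helper name.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of categories."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    joint = Counter(zip(xs, ys))       # observed contingency table
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:                     # a constant variable carries no association
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Two made-up categorical columns: one perfectly associated pair, one independent pair.
perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```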

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  Unnamed: 0  Price  Age0804  KM      HP   cc    Doors  Gears  QuarterlyTax  Weight
0        0           16450  20       22588   97   1400  5      5      85            1110
1        1           7950   75       57144   110  1600  5      5      85            1070
2        2           10950  59       79660   86   1300  5      5      85            1065
3        3           8950   65       60000   86   1300  3      5      69            1015
4        4           9950   55       44537   97   1400  3      5      69            1025
5        5           16750  24       25563   110  1600  3      5      19            1065
6        6           8900   59       36954   110  1600  3      5      69            1050
7        7           7450   76       154900  72   2000  5      5      185           1140
8        8           7250   74       130025  110  1600  3      5      69            1050
9        9           7950   68       57565   86   1300  5      5      69            1035

Last rows

(index)  Unnamed: 0  Price  Age0804  KM      HP   cc    Doors  Gears  QuarterlyTax  Weight
1282     1282        16250  19       29441   97   1400  5      5      85            1110
1283     1283        9450   52       104805  97   1400  3      5      69            1025
1284     1284        10950  57       40214   86   1300  3      5      69            1025
1285     1285        12750  33       27240   110  1600  5      5      85            1075
1286     1286        8000   58       70560   110  1600  3      5      69            1050
1287     1287        8950   61       81170   110  1600  4      5      69            1040
1288     1288        6750   69       60050   110  1600  3      5      69            1050
1289     1289        7400   75       74096   110  1600  3      5      69            1050
1290     1290        11450  50       50400   110  1600  5      5      85            1080
1291     1291        8500   74       66718   110  1600  3      5      69            1050